factoid question
Are Smaller Open-Weight LLMs Closing the Gap to Proprietary Models for Biomedical Question Answering?
Stachura, Damian, Konieczna, Joanna, Nowak, Artur
Open-weight versions of large language models (LLMs) are rapidly advancing, with state-of-the-art models like DeepSeek-V3 now performing comparably to proprietary LLMs. This progression raises the question of whether small open-weight LLMs are capable of effectively replacing larger closed-source models. We are particularly interested in the context of biomedical question-answering, a domain we explored by participating in Task 13B Phase B of the BioASQ challenge. In this work, we compare several open-weight models against top-performing systems such as GPT-4o, GPT-4.1, Claude 3.5 Sonnet, and Claude 3.7 Sonnet. To enhance question answering capabilities, we use various techniques including retrieving the most relevant snippets based on embedding distance, in-context learning, and structured outputs. For certain submissions, we utilize ensemble approaches to leverage the diverse outputs generated by different models for exact-answer questions. Our results demonstrate that open-weight LLMs are comparable to proprietary ones. In some instances, open-weight LLMs even surpassed their closed counterparts, particularly when ensembling strategies were applied. All code is publicly available at https://github.com/evidenceprime/BioASQ-13b.
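For illustration, a minimal sketch of the embedding-distance snippet retrieval described in this abstract, assuming a sentence-transformers encoder; the model name, the top_k value, and the function name are placeholders, not the authors' actual pipeline.

```python
# Minimal sketch of embedding-distance snippet retrieval (illustrative only;
# the encoder name and top_k value are assumptions, not the paper's settings).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed general-purpose encoder

def top_snippets(question: str, snippets: list[str], top_k: int = 5) -> list[str]:
    """Return the snippets closest to the question in embedding space."""
    q_emb = model.encode(question, convert_to_tensor=True)
    s_emb = model.encode(snippets, convert_to_tensor=True)
    scores = util.cos_sim(q_emb, s_emb)[0]            # cosine similarity per snippet
    ranked = scores.argsort(descending=True)[:top_k]  # most similar first
    return [snippets[i] for i in ranked]

# Usage: pass the retrieved PubMed snippets and keep only the closest ones
# before building the LLM prompt.
```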
LLM Ensemble for RAG: Role of Context Length in Zero-Shot Question Answering for BioASQ Challenge
Galat, Dima, Molla-Aliod, Diego
Biomedical question answering (QA) poses significant challenges due to the need for precise interpretation of specialized knowledge drawn from a vast, complex, and rapidly evolving corpus. In this work, we explore how large language models (LLMs) can be used for information retrieval (IR), and an ensemble of zero-shot models can accomplish state-of-the-art performance on a domain-specific Yes/No QA task. Evaluating our approach on the BioASQ challenge tasks, we show that ensembles can outperform individual LLMs and in some cases rival or surpass domain-tuned systems - all while preserving generalizability and avoiding the need for costly fine-tuning or labeled data. Our method aggregates outputs from multiple LLM variants, including models from Anthropic and Google, to synthesize more accurate and robust answers. Moreover, our investigation highlights a relationship between context length and performance: while expanded contexts are meant to provide valuable evidence, they simultaneously risk information dilution and model disorientation. These findings emphasize IR as a critical foundation in Retrieval-Augmented Generation (RAG) approaches for biomedical QA systems. Precise, focused retrieval remains essential for ensuring LLMs operate within relevant information boundaries when generating answers from retrieved documents. Our results establish that ensemble-based zero-shot approaches, when paired with effective RAG pipelines, constitute a practical and scalable alternative to domain-tuned systems for biomedical question answering.
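A hedged sketch of the ensemble idea for the Yes/No task: each model is queried zero-shot on the same question plus retrieved context, and the final answer is a simple majority vote. The `ask_model` helper and the normalization rule are placeholders, not the paper's actual API calls.

```python
# Sketch of a zero-shot ensemble for Yes/No questions (majority vote).
# `ask_model` is a hypothetical wrapper around each provider's API; the
# normalization rule and tie handling are assumptions, not the paper's setup.
from collections import Counter

def normalize(answer: str) -> str:
    """Map a free-text model response to 'yes' or 'no'."""
    return "yes" if answer.strip().lower().startswith("yes") else "no"

def ensemble_yes_no(question: str, context: str, models: list[str]) -> str:
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer yes or no."
    votes = Counter(normalize(ask_model(m, prompt)) for m in models)
    return votes.most_common(1)[0][0]  # majority answer; ties resolved arbitrarily

def ask_model(model_name: str, prompt: str) -> str:
    """Placeholder for a provider-specific chat-completion call."""
    raise NotImplementedError("wire up the Anthropic / Google client here")
```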
Using Pretrained Large Language Model with Prompt Engineering to Answer Biomedical Questions
Our team participated in the BioASQ 2024 Task 12b and Synergy tasks to build a system that can answer biomedical questions by retrieving relevant articles and snippets from the PubMed database and generating exact and ideal answers. We propose a two-level information retrieval and question-answering system based on pre-trained large language models (LLMs), focused on LLM prompt engineering and response post-processing. We construct prompts with in-context few-shot examples and utilize post-processing techniques like resampling and malformed response detection. We compare the performance of various pre-trained LLM models on this challenge, including Mixtral, OpenAI GPT, and Llama 2. Our best-performing system achieved 0.14 MAP score on document retrieval, 0.05 MAP score on snippet retrieval, 0.96 F1 score for yes/no questions, 0.38 MRR score for factoid questions and 0.50 F1 score for list questions in Task 12b.
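A short sketch of the few-shot prompt construction and response post-processing mentioned above, combining malformed-response detection with resampling; the JSON schema, retry count, example content, and `call_llm` helper are assumptions, not the team's exact implementation.

```python
# Sketch of few-shot prompt construction with malformed-response detection and
# resampling (illustrative; schema, retry count, and `call_llm` are assumptions).
import json

FEW_SHOT = [
    {"question": "Is aspirin an NSAID?", "answer": {"exact_answer": "yes"}},
]

def build_prompt(question: str, snippets: list[str]) -> str:
    examples = "\n".join(
        f"Q: {ex['question']}\nA: {json.dumps(ex['answer'])}" for ex in FEW_SHOT
    )
    context = "\n".join(snippets)
    return (f"{examples}\n\nContext:\n{context}\n\nQ: {question}\n"
            "A (reply with JSON containing an 'exact_answer' field):")

def answer_with_resampling(question: str, snippets: list[str], max_tries: int = 3):
    prompt = build_prompt(question, snippets)
    for _ in range(max_tries):
        raw = call_llm(prompt)           # placeholder for the actual LLM call
        try:
            parsed = json.loads(raw)     # malformed-response detection
            if "exact_answer" in parsed:
                return parsed["exact_answer"]
        except json.JSONDecodeError:
            continue                     # resample on malformed output
    return None                          # give up after max_tries attempts

def call_llm(prompt: str) -> str:
    """Placeholder for an OpenAI / Mixtral / Llama 2 completion call."""
    raise NotImplementedError
```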
ASQA: Factoid Questions Meet Long-Form Answers
Stelmakh, Ivan, Luan, Yi, Dhingra, Bhuwan, Chang, Ming-Wei
An abundance of datasets and availability of reliable evaluation metrics have resulted in strong progress in factoid question answering (QA). This progress, however, does not easily transfer to the task of long-form QA, where the goal is to answer questions that require in-depth explanations. The hurdles include (i) a lack of high-quality data, and (ii) the absence of a well-defined notion of the answer's quality. In this work, we address these problems by (i) releasing a novel dataset and a task that we call ASQA (Answer Summaries for Questions which are Ambiguous); and (ii) proposing a reliable metric for measuring performance on ASQA. Our task focuses on factoid questions that are ambiguous, that is, have different correct answers depending on interpretation. Answers to ambiguous questions should synthesize factual information from multiple sources into a long-form summary that resolves the ambiguity. In contrast to existing long-form QA tasks (such as ELI5), ASQA admits a clear notion of correctness: a user faced with a good summary should be able to answer different interpretations of the original ambiguous question. We use this notion of correctness to define an automated metric of performance for ASQA. Our analysis demonstrates an agreement between this metric and human judgments, and reveals a considerable gap between human performance and strong baselines.
The combination of context information to enhance simple question answering
With the rapid development of knowledge bases, question answering over knowledge bases has become a hot research topic. In this paper, we focus on answering single-relation factoid questions over a knowledge base. We build a question answering system and study the effect of context information, such as an entity's notable type and out-degree, on fact selection. Experimental results show that context information can improve the results of simple question answering. Question answering (QA) is a classic natural language processing task that aims at building systems that automatically answer questions formulated in natural language [1]. In recent years, several large-scale general-purpose knowledge bases (KBs) have been constructed, including Freebase [2], YAGO [3], DBpedia [4], and Wikidata [5].
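The abstract does not give the scoring details, so the sketch below only illustrates the general idea of combining a base question-fact match score with context features such as notable type and out-degree; all weights, field names, and the base score are assumptions made for this sketch, not the paper's model.

```python
# Illustrative scoring of candidate KB facts using context features
# (entity notable type and out-degree). Weights and field names are assumptions.
import math
from dataclasses import dataclass

@dataclass
class CandidateFact:
    subject: str
    relation: str
    obj: str
    notable_type: str   # e.g. "book.author"
    out_degree: int     # number of outgoing edges for the subject entity
    match_score: float  # base question-fact similarity from an upstream model

def score(fact: CandidateFact, expected_type: str,
          w_type: float = 0.3, w_degree: float = 0.1) -> float:
    type_bonus = w_type if fact.notable_type == expected_type else 0.0
    degree_bonus = w_degree * math.log1p(fact.out_degree)  # favor well-connected entities
    return fact.match_score + type_bonus + degree_bonus

def select_fact(candidates: list[CandidateFact], expected_type: str) -> CandidateFact:
    """Pick the highest-scoring candidate fact as the answer."""
    return max(candidates, key=lambda f: score(f, expected_type))
```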
Artificial intelligence: ARC test focus goes beyond factoid questions
"Common sense" is a phrase everyone hears at one time or another, usually from an angry bystander who think you don't have any. "Humans use common sense to fill in the gaps of any question they are posed, delivering answers within an understood but non-explicit context," Swapna Krishna wrote in Engadget. Add a few years of developmental growth in the young child, and he or she acquires common sense but AI has problems. Calling out the challenge in AI research is Dr. Oren Etzioni, researcher and professor, who leads the Allen Institute for Artificial Intelligence, or AI2, in Seattle, Washington. To get at the fluidity that people have, their natural ability to move from one thing to the next, the programs need what every ten year old has in spades, he said, and that is called common sense---a set of facts, heuristics, observations, all the things that we can bring to the table, but the computer does not.